Semi-supervised Verb Class Discovery Using Noisy Features

Authors

  • Suzanne Stevenson
  • Eric Joanis
Abstract

We cluster verbs into lexical semantic classes, using a general set of noisy features that capture syntactic and semantic properties of the verbs. The feature set was previously shown to work well in a supervised learning setting, using known English verb classes. In moving to a scenario of verb class discovery, using clustering, we face the problem of having a large number of irrelevant features for a particular clustering task. We investigate various approaches to feature selection, using both unsupervised and semi-supervised methods, comparing the results to subsets of features manually chosen according to linguistic properties. We find that the unsupervised method we tried cannot be consistently applied to our data. However, the semi-supervised approach (using a seed set of sample verbs) overall outperforms not only the full set of features, but also the hand-selected features.
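
As a rough illustration of the seed-based idea in the abstract, the sketch below ranks features by how well they separate a small set of seed verbs with known classes, then keeps only the top-ranked features before any clustering step. The abstract does not state the selection criterion actually used, so the variance-ratio score, the function names (`seed_feature_scores`, `select_features`, `project`), and the toy data here are all assumptions made for illustration.

```python
from statistics import mean, pvariance

def seed_feature_scores(seed_vectors, seed_labels):
    """Score each feature by between-class / within-class variance on the seeds.

    Assumption: this variance ratio is an illustrative criterion only,
    not the criterion used in the paper.
    """
    n_features = len(seed_vectors[0])
    classes = sorted(set(seed_labels))
    scores = []
    for j in range(n_features):
        # Group the values of feature j by the seed verb's known class.
        by_class = {
            c: [v[j] for v, lab in zip(seed_vectors, seed_labels) if lab == c]
            for c in classes
        }
        class_means = [mean(vals) for vals in by_class.values()]
        between = pvariance(class_means)
        within = mean([pvariance(vals) for vals in by_class.values()])
        # Small constant avoids division by zero for constant features.
        scores.append(between / (within + 1e-9))
    return scores

def select_features(seed_vectors, seed_labels, k):
    """Return the sorted indices of the k highest-scoring features."""
    scores = seed_feature_scores(seed_vectors, seed_labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def project(vectors, keep):
    """Restrict every feature vector to the selected feature indices."""
    return [[v[j] for j in keep] for v in vectors]

# Hypothetical toy data: four seed verbs, three features, two classes.
# Features 0 and 2 separate the classes; feature 1 is noise.
seed_vectors = [
    [0.9, 0.5, 0.1],
    [0.8, 0.1, 0.2],
    [0.1, 0.4, 0.9],
    [0.2, 0.2, 0.8],
]
seed_labels = ["A", "A", "B", "B"]

keep = select_features(seed_vectors, seed_labels, k=2)
reduced = project(seed_vectors, keep)
```

The reduced vectors for the full verb set (not just the seeds) would then be passed to whatever clustering algorithm is in use; only the seed verbs need class labels.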

Similar resources

Automatic prediction of aspectual class of verbs in context

This paper describes a new approach to predicting the aspectual class of verbs in context, i.e., whether a verb is used in a stative or dynamic sense. We identify two challenging cases of this problem: when the verb is unseen in training data, and when the verb is ambiguous for aspectual class. A semi-supervised approach using linguistically-motivated features and a novel set of distributional ...

Clinically driven semi-supervised class discovery in gene expression data

MOTIVATION: Unsupervised class discovery in gene expression data relies on the statistical signals in the data to exclusively drive the results. It is often the case, however, that one is interested in constraining the search space to respect certain biological prior knowledge while still allowing a flexible search within these boundaries. RESULTS: We develop an approach to semi-supervised clas...

Supervised Learning Of Lexical Semantic Verb Classes Using Frequency Distributions

We report a number of computational experiments in supervised learning whose goal is to automatically classify a set of verbs into lexical semantic classes, based on frequency distribution approximations of grammatical features extracted from a very large annotated corpus. Distributions of five syntactic features that approximate transitivity alternations and thematic role assignments are sufficient to red...

A case study on supervised classification of Swedish pseudo-coordination

We present a case study on supervised classification of Swedish pseudo-coordination (SPC). The classification is attempted on the type level with data collected from two data sets: a blog corpus and a fiction corpus. Two small experiments were designed to evaluate the feasibility of this task. The first experiment explored a classifier’s ability to discriminate pseudo-coordinations from ordinary...

Verb Class Discovery from Rich Syntactic Data

Previous research has shown that syntactic features are the most informative features in automatic verb classification. We investigate their optimal characteristics by comparing a range of feature sets extracted from data where the proportion of verbal arguments and adjuncts is controlled. The data are obtained from different versions of VALEX [1] – a large SCF lexicon for English which was acq...

Publication date: 2003